Multi-frame Collaboration for Effective Endoscopic Video Polyp Detection via Spatial-Temporal Feature Transformation
Authors
Abstract
Precise localization of polyps is crucial for early cancer screening in gastrointestinal endoscopy. Videos produced by endoscopy bring both richer contextual information and more challenges than still images. The camera-moving situation, instead of the common camera-fixed-object-moving one, leads to significant background variation between frames. Severe internal artifacts (e.g., water flow in the human body, specular reflection of tissues) can make the quality of adjacent frames vary considerably. These factors hinder a video-based model from effectively aggregating features from neighboring frames and giving better predictions. In this paper, we present Spatial-Temporal Feature Transformation (STFT), a multi-frame collaborative framework to address these issues. Spatially, STFT mitigates inter-frame variations in the camera-moving situation with feature alignment by proposal-guided deformable convolutions. Temporally, STFT proposes a channel-aware attention module to simultaneously estimate inter-frame correlation and perform adaptive feature aggregation. Empirical studies and superior results demonstrate the effectiveness and stability of our method. For example, STFT improves the still-image baseline FCOS by \(10.6\%\) and \(20.6\%\) on the comprehensive F1-score of the polyp localization task on the CVC-Clinic and ASUMayo datasets, respectively, and outperforms the state-of-the-art video-based method by \(3.6\%\) and \(8.0\%\), respectively. Code is available at https://github.com/lingyunwu14/STFT.
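The channel-aware aggregation idea described above can be illustrated with a minimal NumPy sketch. This is a hypothetical, simplified stand-in, not STFT's actual learned module: it assumes neighbor features are already spatially aligned, and it uses per-channel cosine similarity to the reference frame as a correlation score, softmax-normalized across frames to form adaptive per-channel weights.

```python
import numpy as np

def channel_aware_aggregate(ref, neighbors):
    """Adaptively fuse neighbor frame features into the reference frame.

    ref:       (C, H, W) feature map of the reference frame
    neighbors: list of (C, H, W) feature maps from adjacent frames,
               assumed already spatially aligned to the reference
    """
    frames = [ref] + list(neighbors)
    C = ref.shape[0]
    # Per-channel cosine similarity between each frame and the reference:
    # a rough proxy for how correlated (and thus how trustworthy) each
    # neighbor channel is for this reference frame.
    scores = np.empty((len(frames), C))
    for i, f in enumerate(frames):
        for c in range(C):
            a, b = ref[c].ravel(), f[c].ravel()
            scores[i, c] = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
    # Softmax over the frame axis -> adaptive per-channel weights.
    w = np.exp(scores - scores.max(axis=0, keepdims=True))
    w /= w.sum(axis=0, keepdims=True)
    # Weighted sum of feature maps, channel by channel.
    return sum(w[i][:, None, None] * f for i, f in enumerate(frames))
```

In the real method, both the alignment (proposal-guided deformable convolution) and the attention weights are learned end-to-end rather than computed with a fixed similarity heuristic.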
Similar Resources
Polyp Detection in Endoscopic Video Using SVMs
Colon cancer is one of the most common cancers in developed countries. Most of these cancers start with a polyp. Polyps are easily detected by physicians. Our goal is to mimic this detection ability so that endoscopic videos can be pre-scanned with our algorithm before the physician analyses them. The method will indicate which part of the video needs attention (polyps were detected there) and ...
Deep Spatial-Temporal Joint Feature Representation for Video Object Detection
With the development of deep neural networks, many object detection frameworks have shown great success in the fields of smart surveillance, self-driving cars, and facial recognition. However, the data sources are usually videos, and the object detection frameworks are mostly established on still images and only use the spatial information, which means that the feature consistency cannot be ens...
Tight frame approximation for multi-frames and super-frames
In this thesis, a generator for multi-frames or super-frames generated under the action of a projective unitary representation of a countable discrete group is studied. Examples of such frames include Gabor multi-frames, Gabor super-frames, and frames for shift-invariant subspaces. We show that there exists a unique normalized tight multi-frame (super-frame) generator that attains the minimal distance from it. Similar problems for dual frames are also posed, and some ...
A spatial-temporal approach for video caption detection and recognition
We present a video caption detection and recognition system based on a fuzzy-clustering neural network (FCNN) classifier. Using a novel caption-transition detection scheme we locate both spatial and temporal positions of video captions with high precision and efficiency. Then employing several new character segmentation and binarization techniques, we improve the Chinese video-caption recogniti...
Spatial-Temporal Memory Networks for Video Object Detection
We introduce Spatial-Temporal Memory Networks (STMN) for video object detection. At its core, we propose a novel Spatial-Temporal Memory module (STMM) as the recurrent computation unit to model long-term temporal appearance and motion dynamics. The STMM’s design enables the integration of ImageNet pre-trained backbone CNN weights for both the feature stack as well as the prediction head, which ...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2021
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-87240-3_29